Knowledge integration for improving performance in LVCSR

Authors

Chen-Yu Chiang

Sabato Marco Siniscalchi

Sin-Horng Chen

Chin-Hui Lee

Abstract

This paper presents a knowledge integration framework to improve performance in large vocabulary continuous speech recognition. Two types of knowledge sources, manner attribute and prosodic structure, are incorporated. For manner of articulation, six attribute detectors trained with an American English corpus (WSJ0) are utilized to rescore hypothesized phones in word lattices obtained by a baseline ASR system. For the prosodic structure, models trained with an unsupervised joint prosody labeling and modeling (PLM) technique using WSJ0 are used in lattice rescoring. Experimental results on the American English WSJ word recognition task of the Nov92 test set show that the proposed approach significantly outperforms the baseline system that does not use articulatory and prosodic information. The results also demonstrate the effectiveness and usefulness of the PLM technique in constructing prosodic models for American English ASR.
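The rescoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple log-linear combination of scores, works on a short list of whole hypotheses instead of lattice arcs, and uses made-up scores. The names `Hypothesis`, `rescore`, `w_manner`, and `w_prosody` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: str
    baseline: float  # log score from the baseline ASR system (illustrative value)
    manner: float    # log score from manner attribute detectors (illustrative value)
    prosody: float   # log score from the prosodic model (illustrative value)


def rescore(hyps, w_manner=0.5, w_prosody=0.5):
    """Return the hypothesis with the highest weighted sum of log scores.

    Assumption: the knowledge sources are combined log-linearly; the
    weights here are arbitrary and would normally be tuned on held-out data.
    """
    def combined(h):
        return h.baseline + w_manner * h.manner + w_prosody * h.prosody

    return max(hyps, key=combined)


# Toy hypotheses with invented scores.
hyps = [
    Hypothesis("recognize speech", -10.0, -2.0, -1.0),
    Hypothesis("wreck a nice beach", -9.5, -4.0, -3.0),
]

best = rescore(hyps)
print(best.words)
```

With both weights set to zero the function falls back to the baseline ranking, which makes it easy to compare the output with and without the extra knowledge sources.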

Related resources

Tone information as a confidence measure for improving Cantonese LVCSR

Cantonese, a syllabically paced, southern Chinese dialect, is also a tonal language. A Cantonese syllable can have up to 9 different tone patterns, which are lexically important. In this paper, after reviewing major approaches to incorporating tone information into a large vocabulary continuous speech recognition (LVCSR) system, we propose two schemes to employ the tone information as a confidenc...

A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines

A bottom-up, stepwise knowledge integration framework is proposed to realize detection-based, large vocabulary continuous speech recognition (LVCSR) with a weighted finite state machine (WFSM). The WFSM framework offers a flexible architecture for different types of knowledge network compositions, each of which can be built and optimized independently. Speech attribute detectors are used as an ...

Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task

This paper presents speech-driven Web retrieval models which accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluated the techniques of combining outputs of multiple LVCSR models in recognition...

Towards High Performance LVCSR in Speech-to-Speech Translation System on Smart Phones

This paper presents efforts to improve the performance of large vocabulary continuous speech recognition (LVCSR) in a speech-to-speech translation system on smart phones. A variety of techniques towards high LVCSR performance are investigated to achieve high accuracy and low latency given constrained resources. This includes one-pass streaming mode decoding for minimum latency, acoustic mode...

Integrating Hypotheses of Multiple Recognizers for Improving Mandarin LVCSR Performance

In this paper, we investigate how to improve Mandarin LVCSR performance by integrating multiple hypotheses from recognizers running in parallel. Different recognizers are trained by employing: (1) different phone sets, (2) different front-ends, and (3) different training sets. N-best hypotheses are merged into a character transition network (CTN) and ROVER is used to select the final recognition...

Publication date: 2013
